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DIMENSIONALITY REDUCTION 


= The number of input features, variables, or columns present in a given 
dataset is known as dimensionality, and the process to reduce these 
features is called dimensionality reduction. 
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WHY DIMENSIONALITY REDUCTION? 


= In many cases a dataset contains a huge number of input features, 
which makes the predictive modeling task more complicated. Because it 
is very difficult to visualize or make predictions for a training dataset 
with a high number of features, dimensionality reduction techniques 
are needed in such cases. 


= A dimensionality reduction technique can be defined as "a way of 
converting a higher-dimensional dataset into a lower-dimensional 
dataset while ensuring that it provides similar information." These 
techniques are widely used in machine learning to obtain a better-fitting 
predictive model when solving classification and regression problems. 
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DIMENSIONALITY 


[Figure: performance plotted against the number of features/dimensions, with the optimal number of features marked.] 
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WHY DIMENSIONALITY REDUCTION? 


= Visualization: projection of high-dimensional data onto 2D or 3D. 


= Data compression: efficient storage and retrieval. 


= Noise removal: positive effect on query accuracy. 
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DOCUMENT CLASSIFICATION 
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= Task: to classify unlabeled documents into categories 


= Challenge: thousands of terms 


= Solution: apply dimensionality reduction 
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DIMENSIONALITY REDUCTION 


= It is commonly used in fields that deal with high-dimensional 
data, such as speech recognition, signal processing, and 
bioinformatics. It can also be used for data visualization, 
noise reduction, and cluster analysis. 
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Dimensionality reduction 
Techniques 
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THE CURSE OF DIMENSIONALITY 


= Handling high-dimensional data is very difficult in practice, a 
problem commonly known as the curse of dimensionality. As the 
dimensionality of the input dataset increases, any machine learning 
algorithm and model becomes more complex. As the number of 
features increases, the number of samples needed to generalize well 
grows rapidly (often exponentially), and the chance of overfitting also 
increases. A machine learning model trained on high-dimensional data 
with too few samples tends to overfit and perform poorly. 
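
A rough sketch of why the required data grows so quickly, assuming each feature's range is split into a fixed number of bins and we want at least one sample per grid cell. The function name `cells_to_cover` and the choice of 10 bins are illustrative assumptions, not part of the slides.

```python
def cells_to_cover(bins_per_feature: int, n_features: int) -> int:
    """Number of grid cells needed to cover the feature space when every
    feature is split into the same number of bins (illustrative assumption)."""
    return bins_per_feature ** n_features


# With 10 bins per feature, the number of cells (and so the minimum number
# of samples needed to place one point in each cell) grows exponentially
# with the number of features.
for d in (1, 2, 3, 10):
    print(d, cells_to_cover(10, d))
```

Adding one feature multiplies the number of cells by the number of bins, so a dataset of fixed size becomes increasingly sparse as features are added.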
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BENEFITS OF APPLYING DIMENSIONALITY REDUCTION 


Some benefits of applying dimensionality reduction technique to the given dataset are 
given below: 


= By reducing the dimensions of the features, the space required to store the dataset 
also gets reduced. 


= Less computation and training time is required when the features have fewer dimensions. 
= Reduced dimensions of the features help in visualizing the data quickly. 


= It removes redundant features (if present) by taking care of multicollinearity. 
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DISADVANTAGES OF DIMENSIONALITY REDUCTION 


There are also some disadvantages of applying the dimensionality reduction, which are 
given below: 


= Some data may be lost due to dimensionality reduction. 


= In the PCA dimensionality reduction technique, the number of principal components 
to keep is sometimes not known in advance. 
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APPROACHES OF DIMENSION REDUCTION 


= Feature Selection 


= Feature Extraction 
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FEATURE SELECTION 


Feature selection is the process of selecting the subset of the relevant 
features and leaving out the irrelevant features present in a dataset to build 
a model of high accuracy. In other words, it is a way of selecting the 
optimal features from the input dataset. 
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FEATURE SELECTION — METHODS 


= Filter Methods 


= Wrapper Methods 
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FILTER METHODS 


In this method, the dataset is filtered, and a subset that contains only the relevant 
features is taken. Some common filter-method techniques are: 


= Correlation 
= Chi-Square Test 
= ANOVA 
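
A minimal sketch of the correlation filter listed above, assuming numeric features and a numeric target. The helper names `pearson` and `select_top_k` and the toy data are hypothetical; a real project would more likely use a statistics library.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no linear information
    return cov / (sx * sy)


def select_top_k(features, target, k):
    """Keep the k features with the largest absolute correlation to the target."""
    ranked = sorted(
        features,
        key=lambda name: abs(pearson(features[name], target)),
        reverse=True,
    )
    return ranked[:k]


# Toy data (made up for illustration): "a" and "c" track the target, "b" does not.
features = {
    "a": [2, 4, 6, 8, 10],
    "b": [1, -1, 1, -1, 1],
    "c": [5, 4, 3, 2, 1],
}
target = [1, 2, 3, 4, 5]
print(select_top_k(features, target, 2))
```

No model is trained here: each feature is scored on its own against the target, which is what makes filter methods cheap compared with wrapper methods.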
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WRAPPER METHODS 


The wrapper method has the same goal as the filter method, but it uses a machine learning 
model for its evaluation. In this method, some features are fed to the ML model and its 
performance is evaluated. The performance decides whether to add or remove those features to 
increase the accuracy of the model. This method is usually more accurate than the filter method but 
more complex and computationally expensive. Some common wrapper-method techniques are: 


= Forward Selection 


= Backward Selection 


= Bi-directional Elimination 


DR. HAIDER ALI Machine Learning 18 


2. WRAPPER METHODS (FEATURE SELECTION) 


= A. Forward Selection 


Forward Selection is an iterative method. 


In this method, we start with one feature and keep adding features until no improvement in the model is 
observed. 


The search is stopped once a pre-set criterion is met. 


This is a greedy approach because at each step it adds the feature that gives the largest boost to the 
performance. 


If the number of features is large, it can be computationally expensive. 


= B. Backward Elimination 


This process is the opposite of the Forward Selection method. 


It starts with all the features and keeps removing features until no improvement is observed. 
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WRAPPER METHOD: FORWARD FEATURE SELECTION 
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Steps to perform Forward Feature Selection 


1. Train n models, one for each of the n features individually, and check the performance 

2. Choose the variable which gives the best performance 

3. Repeat the process and add one variable at a time 

4. The variable producing the highest improvement is retained 

5. Repeat the entire process until there is no significant improvement in the 
model's performance 
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WRAPPER METHOD: FORWARD FEATURE SELECTION 


= Fitness level prediction 


= So the first step in Forward Feature Selection is to train models using each 
feature individually and check the performance. 


= If you have three independent variables, we train three models, one for each of 
these three features individually. 


ID   Calories burnt   Gender   Plays Sport?   Fitness Level 
1    121              M        Yes            Fit 
2    230              M        No             Fit 
3    342              F        No             Unfit 
4    70               M        Yes            Fit 
5    278              F        Yes            Unfit 
6    146              M        Yes            Fit 
7    168              F        No             Unfit 
8    231              F        Yes            Fit 
9    150              M        No             Fit 
10   190              F        No             Fit 
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FORWARD FEATURE SELECTION: EXAMPLE 


* Let's say we trained the model using the Calories_Burnt feature and the 
target variable, Fitness_Level, and we got an accuracy of 87%. 


Accuracy = 87% 
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FORWARD FEATURE SELECTION: EXAMPLE CONT. 


Next, we'll train the model using 
the Gender feature, and we get an 
accuracy of 80%. 


Variable used    Accuracy 
Gender           80.00% 
Plays Sport?     85.00% 
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Accuracy = 80% 
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EXAMPLE CONT. 


= Next, we will repeat this process and add one variable at a time. So of course 
we'll keep the Calories_Burnt variable and keep adding one variable. Let's 
take Gender here; using this we get an accuracy of 88%. 


Accuracy = 88% 
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EXAMPLE CONT 


= Using Plays_Sport along with Calories_Burnt, 
we get an accuracy of 91%. The variable 
that produces the highest improvement 
is retained. 
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Accuracy = 91% 
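
The walk-through above can be sketched as a small greedy loop. This is only an illustration: instead of training a model, the hypothetical `slide_score` function looks up the accuracies quoted on the slides. The 92% figure for all three variables is borrowed from the backward elimination example, and the stopping threshold `min_gain=0.02` is an assumption.

```python
FEATURES = ["Calories_Burnt", "Gender", "Plays_Sport"]

# Accuracies quoted on the slides, keyed by the set of features used.
SLIDE_ACCURACY = {
    frozenset({"Calories_Burnt"}): 0.87,
    frozenset({"Gender"}): 0.80,
    frozenset({"Plays_Sport"}): 0.85,
    frozenset({"Calories_Burnt", "Gender"}): 0.88,
    frozenset({"Calories_Burnt", "Plays_Sport"}): 0.91,
    # Taken from the backward elimination example (all variables).
    frozenset({"Calories_Burnt", "Gender", "Plays_Sport"}): 0.92,
}


def slide_score(feature_set):
    """Stand-in for 'train a model on these features and return its accuracy'."""
    return SLIDE_ACCURACY[frozenset(feature_set)]


def forward_select(features, score, min_gain=0.0):
    """Greedy forward selection: add the best feature until the gain is too small."""
    selected, best_score = [], 0.0
    remaining = list(features)
    while remaining:
        # Score every candidate set obtained by adding one more feature.
        trial = {f: score(selected + [f]) for f in remaining}
        best_feature = max(trial, key=trial.get)
        if trial[best_feature] - best_score < min_gain:
            break  # no significant improvement, so stop
        selected.append(best_feature)
        remaining.remove(best_feature)
        best_score = trial[best_feature]
    return selected


print(forward_select(FEATURES, slide_score, min_gain=0.02))
```

In practice `score` would train and validate a model for each candidate set, which is why forward selection becomes expensive when there are many features.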
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BACKWARD FEATURE ELIMINATION: EXAMPLE 


= Fitness level prediction 


= The first step is to train the model using all 
the variables. 


= Of course, you will not use the ID variable to 
train the model, as ID contains a unique 
value for each observation. 


= So we'll first train the model using the other 
three independent variables and, of course, 
the target variable, which is 
Fitness_Level. 


= We get an accuracy of 92% using all 
three independent variables. 
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Variable_dropped   Accuracy 
Calories_burnt     90% 
Gender             91.60% 
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Variable_dropped   Accuracy 
Plays_Sport?       88% 
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BACKWARD FEATURE ELIMINATION: EXAMPLE 


* Gender produced the smallest change in the performance of the model: 
accuracy was 92% when we took all the variables and 91.6% when we 
dropped Gender. So we can infer that Gender does not have a high impact 
on the Fitness_Level variable, and hence it can be dropped. 


Accuracy using all the variables = 92% 


Variable_dropped   Accuracy 
Calories_burnt     90% 
Gender             91.60% 
Plays_Sport?       88% 


* Finally, we repeat all these steps until no more variables can be 
dropped. 


* It's a very simple but very effective technique. 
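
The elimination steps above can be sketched in the same lookup style. The three-variable and two-variable accuracies come from this example, while the single-variable accuracies are reused from the forward selection example. The tolerance `max_drop=0.01` and the names `slide_score` and `backward_eliminate` are assumptions made for illustration.

```python
FEATURES = ["Calories_Burnt", "Gender", "Plays_Sport"]

# Accuracies quoted on the slides, keyed by the set of features still in use.
SLIDE_ACCURACY = {
    frozenset({"Calories_Burnt", "Gender", "Plays_Sport"}): 0.92,
    frozenset({"Gender", "Plays_Sport"}): 0.90,           # Calories_Burnt dropped
    frozenset({"Calories_Burnt", "Plays_Sport"}): 0.916,  # Gender dropped
    frozenset({"Calories_Burnt", "Gender"}): 0.88,        # Plays_Sport dropped
    # Single-feature accuracies reused from the forward selection example.
    frozenset({"Calories_Burnt"}): 0.87,
    frozenset({"Plays_Sport"}): 0.85,
    frozenset({"Gender"}): 0.80,
}


def slide_score(feature_set):
    """Stand-in for 'train a model on these features and return its accuracy'."""
    return SLIDE_ACCURACY[frozenset(feature_set)]


def backward_eliminate(features, score, max_drop=0.01):
    """Repeatedly drop the feature whose removal hurts accuracy the least."""
    kept = list(features)
    current = score(kept)
    while len(kept) > 1:
        # Accuracy after removing each feature in turn.
        trial = {f: score([x for x in kept if x != f]) for f in kept}
        weakest = max(trial, key=trial.get)
        if current - trial[weakest] > max_drop:
            break  # every remaining feature matters too much to drop
        kept.remove(weakest)
        current = trial[weakest]
    return kept


print(backward_eliminate(FEATURES, slide_score))
```

With these numbers Gender is dropped first (92% to 91.6%), and the loop then stops because removing either remaining variable costs more than the assumed tolerance.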
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FEATURE EXTRACTION 


Feature extraction is the process of transforming a space with many dimensions 
into a space with fewer dimensions. This approach is useful when we want to retain 
most of the information but use fewer resources while processing it. 


Some common feature extraction techniques are: 
1. Principal Component Analysis 

2. Linear Discriminant Analysis 

3. Kernel PCA 


4. Quadratic Discriminant Analysis 
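
As a small illustration of the first technique in the list, here is a sketch of PCA reducing 2-D points to 1-D, using the closed-form angle of the principal axis of a 2x2 covariance matrix. The function name `pca_2d_to_1d` and the sample points are hypothetical, and a real application would normally rely on a numerical library and handle more than two dimensions.

```python
import math


def pca_2d_to_1d(points):
    """Project 2-D points onto their first principal component.

    Returns the unit direction of the component, the 1-D scores, and the
    fraction of total variance that the component explains.
    """
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]

    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centered) / (n - 1)
    syy = sum(y * y for _, y in centered) / (n - 1)
    sxy = sum(x * y for x, y in centered) / (n - 1)

    # Angle of the eigenvector with the largest eigenvalue (2x2 case only).
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    scores = [x * direction[0] + y * direction[1] for x, y in centered]
    total = sxx + syy
    explained = (sum(s * s for s in scores) / (n - 1)) / total if total else 0.0
    return direction, scores, explained


# Made-up points that lie roughly along a line.
points = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8)]
direction, scores, explained = pca_2d_to_1d(points)
print(direction, explained)
```

Unlike feature selection, the new single feature is a combination of both original features rather than a subset of them.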
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THANK YOU 
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